A Scoring Method for the Clustering of Nucleic Acid Sequences

Authors

  • Barileé Barisi Baridam
  • A. Ben-Dor
  • R. Shamir
Abstract

The clustering of biological sequence data is a significant task for biologists, because it helps molecular biologists group sequences by the ancestral traits or hereditary information hidden in them. Several clustering algorithms and similarity and distance measures have been proposed for similarity detection and clustering. As observed in the course of this study, most of these algorithms and measures are inefficient at detecting sequences based on their structural similarity. In this paper, the codon-based scoring method (COBASM) is developed to address this inefficiency. COBASM applies the codon principle, using triplet nucleotides, to the clustering of nucleic acid sequences. The results show that COBASM produces compact and well-separated clusters based on the structural similarity of sequences.
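
The abstract does not state the scoring formula, so the following is only a minimal sketch of what a triplet-based comparison might look like. The function names `codons` and `codon_match_score`, and the choice of scoring by the fraction of position-wise identical triplets, are assumptions made for illustration; they are not taken from the paper and should not be read as COBASM itself.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into non-overlapping triplets.

    Any trailing one or two bases that do not form a full triplet are dropped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_match_score(a: str, b: str) -> float:
    """Hypothetical similarity score (an assumption, not the paper's formula).

    Returns the fraction of triplets that are identical at the same position,
    relative to the triplet count of the longer sequence.
    """
    ca, cb = codons(a), codons(b)
    longest = max(len(ca), len(cb))
    if longest == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(ca, cb))
    return matches / longest
```

A pairwise score of this kind could be turned into a distance (for example, one minus the score) and passed to any standard clustering routine; whether COBASM does so is not stated in the abstract.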


Similar resources

Investigating the Particle Swarm Optimization Clustering Method on Nucleic Acid Sequences

Particle swarm optimization (PSO) has been employed on several optimization problems, including the clustering problem. PSO has also been employed in the clustering of data of different structure and dimensionality. In this paper it is employed in the clustering of nucleic acid sequences. The application of clustering, as a statistical tool, in the analysis of data of varied complexity has been...


Phylogenetic and sequence analysis of the growth hormone gene of two sturgeons, Huso huso and Acipenser Gueldenstaedtii

In this study, the growth hormone cDNA (cGH) of the Beluga sturgeon (Huso huso) and the Russian sturgeon (Acipenser gueldenstaedtii) was cloned and sequenced, and phylogenetic relationships were examined using nucleic acid and amino acid sequences. The nucleotide sequence of the Beluga GH has an open reading frame of 645 nucleotides encoding a protein of 214 amino acid residues. The signal peptide cleav...


Signal processing approaches as novel tools for the clustering of N-acetyl-β-D-glucosaminidases

Nowadays, the clustering of proteins, and of enzymes in particular, is one of the most popular topics in bioinformatics. An increasing number of chitinase genes from different organisms, and their sequences, have been identified. So far, various mathematical algorithms have been used for the clustering of chitinase genes, but most of them seem to be confusing and sometimes insufficient. In the...


Designing a Label Free Aptasensor for Detection of Methamphetamine

A label-free electrochemical nucleic acid aptasensor for the detection of methamphetamine (MA) by the immobilization of thiolated self-assembled DNA sequences on a gold nanoparticles-chitosan modified electrode is constructed. When MA was complexed specifically to the aptamer, the configuration of the nucleic acid aptamer switched to a locked structure and the interface of the biosensor changed...


Optimization of the Analysis of Almond DNA Simple Sequence Repeats (SSRs) Through Submarine Electrophoresis Using Different Agaroses and Staining Protocols

Simple sequence repeats (SSR markers or microsatellites), based on the specific PCR amplification of DNA sequences, are becoming the markers of choice for the molecular characterization of a wide range of plants because of their high polymorphism, abundance, and codominant inheritance. Different methods have been used for the analysis of the SSR amplified fragments being submarine agarose electropho...




Publication date: 2012